Abstract

Sensor networks are vulnerable to false data injection attacks and path-based DoS (PDoS) attacks. While
conventional authentication schemes are insufficient for solving these security problems, an en-route filtering scheme
acts as a defense against both attacks. To construct an efficient en-route filtering scheme, this paper first presents
a Constrained Function based message Authentication (CFA) scheme, which can be thought of as a hash function 
directly supporting the en-route filtering functionality. Together with the redundancy property of sensor networks, 
which means that an event can be simultaneously observed by multiple sensor nodes, the devised CFA scheme is 
used to construct a CFA-based en-route filtering (CFAEF) scheme. In contrast to most of the existing methods, 
which rely on complicated security associations among sensor nodes, our design, which directly exploits an en-route 
filtering hash function, appears to be novel. We examine the CFA and CFAEF schemes from both the theoretical 
and numerical aspects to demonstrate their efficiency and effectiveness.

I. Introduction

A Wireless Sensor Network (WSN) is composed of a large number of sensor nodes with limited 
resources. Since WSNs can be deployed in an unattended or hostile environment, the design of an efficient 
authentication scheme is of great importance to the data authenticity and integrity in WSNs. In this 
respect, many authentication schemes have been proposed. The most straightforward way to guarantee data
authenticity is to use conventional digital signature techniques based on public-key cryptography. Although
the use of public-key cryptography on WSNs has been demonstrated in [20], [22] to be feasible, the
computation overhead is still rather high for resource-constrained devices.

Authentication Problem. Sensor networks are vulnerable to the false data injection attack [33], by which the adversary injects false data in an attempt to deceive the base station (BS, or data sink), and to the path-based DoS (PDoS) attack [10], by which the adversary sends bogus messages to randomly selected nodes so as to waste the energy of forwarding nodes*. Several so-called en-route filtering schemes have been proposed to quickly discover and remove bogus event reports injected by the adversary. Here, "en-route filtering" means that not only the destination node but also the intermediate nodes can check the authenticity of a message, which reduces the number of hops a bogus message travels and thereby conserves energy. Hence, it is especially useful in mitigating false data injection and PDoS attacks [10], because falsified messages are filtered out as early as possible.

Related Work. SEF [34] is the first en-route filtering scheme found in the literature that exploits 
probabilistic key sharing over a partitioned key pool. Due to its design strategy, however, only a few 
intermediate nodes between the source-destination node pair have the ability to check the validity of 
forwarding messages, leading to low filtering capability. IHA [38], which verifies the transmitted packets in 
a deterministic hop-by-hop fashion, has also been proposed to authenticate the event report. Nevertheless, it 
requires complicated key sharing among neighboring nodes and could be vulnerable to node compromises 
if node compromises are mounted immediately after sensor deployment. Based on ideas similar to those used in SEF and IHA, several other en-route filtering schemes have been proposed. With the sophisticated use of one-way hash chains in clustered sensor networks, DEF [30] has improved filtering power over SEF [34]. Using the proposed multiple-axis technique, GREF [32] is designed to support en-route filtering in networks with multiple data sinks. LBRS [35] and LEDS [24] take advantage of location information to enhance the resilience to node compromises. CCEF [31], STEF [14], and KAEF [36] authenticate the transmitted packets only in query-based sensor networks.

*The terms "forwarding node" and "intermediate node" are used interchangeably in this paper.

Note that, as to broadcast authentication, μTESLA and its variants [19], [23] can also serve message authentication well. Nevertheless, broadcast authentication is used to authenticate only the messages sent from the base station, while en-route filtering schemes are used for authenticating and filtering out a bogus event report, which is assumed not to have been detected by multiple legitimate sensor nodes, in a node-to-node or node-to-BS communication pattern. Thus, the design of broadcast authentication schemes is orthogonal to the content of this paper.

The Design of En-Route Filtering Schemes. The redundancy property, which means that an event 
can be simultaneously observed by multiple sensor nodes, can be used to design the en-route filtering 
schemes. Specifically, the general design framework is that the source node that senses an event and wants 
to send an event report to the destination node first collects the neighboring nodes' endorsements of the 
sensed event. Afterwards, it sends out the event report and endorsements. Each intermediate node and the 
destination node can check the authenticity of the received report via the verification of the endorsements. 

Aiming to enhance the filtering capability and improve the resilience against node compromises, most of the existing en-route filtering schemes rely on complicated security associations (e.g., key sharing) and, therefore, require assumptions such as secure bootstrapping time, stable routing, a single data sink, and the immobility of sensor nodes, which make them impractical. We identify the following four problems associated with the existing schemes.

1) The unnecessary assumptions stem from the fact that the message authentication codes (MACs, or keyed hash functions) used do not support en-route filtering functionality, while the authenticity of the forwarding messages needs to be checked by as many intermediate nodes as possible.

2) It has been demonstrated in [6], [9], [26] that, in certain in-network control scenarios, a node may send an event report to other nodes. Nonetheless, the existing schemes, which are effective only on the node-to-BS communication pattern, cannot handle false data injection and PDoS attacks in such scenarios.

3) The existing en-route filtering schemes are difficult to apply on mobile sensor networks or networks 
with multiple sinks. In other words, the applicability of en-route filtering schemes on different 
network settings should be improved. 

4) Last, based on conventional design, all the en-route filtering schemes suffer from a special kind 
of DoS attack, false endorsement DoS (FEDoS) attack [15], which could neutralize the advantages 
gained from the use of en-route filtering schemes. 

In this paper, we take a completely different approach to the design of an en-route filtering scheme to avoid the above problems. Instead of establishing security associations, we construct an en-route filtering hash function, the Constrained Function-based Authentication (CFA) scheme, and then employ this hash function to generate the MACs used to endorse the sensor readings, so that each intermediate node can verify the authenticity of forwarding messages. In particular, our proposed CFA possesses the following four characteristics: 1) Resilience to node compromise (RNC), which means that the compromised nodes cannot forge the messages sent from the genuine nodes; 2) Immediate authentication (IA), which can be thought of as a synonym for en-route filtering and can be used to filter out the falsified messages as soon as possible to conserve energy; 3) Independence of network setting (INS), which means that CFA can be applied on networks with different network settings; 4) Efficiency (EFF), which means that CFA has low computational and communication overhead. With these characteristics, a CFA-based en-route filtering (CFAEF) scheme can be constructed in such a way that the source node sends to the destination node a message, together with the corresponding CFA-based endorsements generated by the neighboring nodes. Afterwards, the source node can determine whether the neighboring nodes have sent false endorsements, and each intermediate node has the ability to check the authenticity of forwarding messages. As a whole, as we will show later, the advantages of applying CFA to MAC generation are that the filtering capability can be improved, resilience against the FEDoS attack can be achieved, and the impractical assumptions previously made in the literature are no longer required.

Our Contributions. Our contributions are as follows:

• A Constrained Function based Authentication (CFA) scheme for WSNs is proposed. CFA can be 
thought of as a hash function directly supporting en-route filtering functionality, and can act as a 
building block for other security mechanisms. 

• A CFA-based En-route Filtering (CFAEF) scheme that can simultaneously defend against false data 
injection, PDoS, and FEDoS attacks is proposed. Particularly, compared with the existing methods, 
which either have low filtering capability or necessitate some unrealistic assumptions, our CFAEF 
scheme can be applied on arbitrary networks without further assumptions. 

• The efficiency of CFA and CFAEF schemes is studied in both theoretical and numerical aspects. 

II. System Model 

Network model. We assume a WSN composed of $N$ resource-limited sensor nodes with IDs in a set $\mathcal{I} \subset \mathbb{N}$. The unique ID of each node can be either arbitrarily assigned in the sensor platform, such as TelosB, or fixed in specific sensing hardware when manufactured, like the MAC address on current Network Interface Cards (NICs). Although one or multiple base stations (or data sinks) are involved in data collection in a WSN, the efficiency of our proposed schemes does not rely on their trustworthiness and authenticity. In addition, arbitrary network topology is allowed in our method. Some or all of the sensor nodes can be mobile. The network planner is also assumed to have no knowledge of the sensors' locations prior to sensor deployment.

Security model. The objectives of the adversary are to deceive the BS into accepting a falsified event report and to deplete sensor nodes' energy by launching PDoS and FEDoS attacks. In this paper, sensor nodes are assumed not to be equipped with tamper-resistant hardware. Thus, all the information stored in a node is exposed and can be utilized by the adversary once the node is captured. We also assume that attacks such as node compromises can be mounted by the adversary immediately after sensor deployment, i.e., the proposed schemes cannot rely on the secure bootstrapping time used in [24], [37]. If required, any pair of sensor nodes can establish their shared key† in a noninteractive fashion [33]. Although sensor networks are known to be vulnerable to many other attacks, such as the wormhole attack and the selective forwarding attack, we refer to the existing rich literature [3], [13], [16], [29] for these issues; the defense against these attacks is beyond the scope of this paper.

III. The Constrained Function Based Authentication (CFA) Scheme 

Since the proposed CFA scheme is constructed by making use of the pairwise key generated by the CARPY+ scheme [33] for secure communication, we first briefly review CARPY+ in Sec. III-A to make this paper self-contained. Then, the proposed CFA scheme will be presented in the remaining subsections. In this paper, nodes $u$, $v$, and $e$ denote the source node, destination node, and intermediate node, respectively.

A. Review of the CARPY+ Scheme [33] 

Let $N$, $\lambda$, and $\mathbb{F}_q = \{0, \ldots, q-1\}$, where $q$ is a prime number, be the number of sensor nodes, a security parameter independent of $N$, and a finite field, respectively. Let $A = (D \cdot G)^T$, where $D \in \mathbb{F}_q^{(\lambda+1)\times(\lambda+1)}$ is a symmetric matrix, $G \in \mathbb{F}_q^{(\lambda+1)\times N}$ is a matrix, and $(D \cdot G)^T$ is the transpose of $(D \cdot G)$. Let $K = A \cdot G$. $K$ must be symmetric because $A \cdot G = (D \cdot G)^T \cdot G = G^T \cdot D \cdot G = (A \cdot G)^T$. Before sensor deployment, proper constrained random perturbation vectors are selected and applied on each row vector of $A$ to construct a matrix $W$. In addition, $G$ is selected as a Vandermonde matrix generated by a seed $s$. The $j$-th row vector of $W$, $W_{j,-}$, is stored in node $j$. After sensor deployment, node $u$ can obtain the shared key with node $v$ by calculating the inner product of the row vector $W_{u,-}$ and the $v$-th column vector $G_{-,v}$, and then extracting the common part as the shared key. Note that in the CARPY+ scheme, $G$ and $s$ can be publicly known while $A$ should be kept secret. Therefore, CARPY+ can establish a pairwise key between each pair of sensor nodes without needing any communication. This property is an essential part of constructing the proposed CFA scheme, because establishing a key via communication would itself require authentication, leading to a circular dependency.

†Here, the key establishment scheme in [33], instead of the ones in [5], [7], [8], [11], [17], [21], is chosen for our proposed method because the latter are interactive, which means that two nodes must communicate with each other whenever they would like to establish their common key.
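The noninteractive property reviewed above can be illustrated with a minimal Python sketch. It covers only the unperturbed core $K = (D \cdot G)^T \cdot G$; the constrained random perturbation that turns $A$ into $W$ and the "common part" extraction of CARPY+ are deliberately omitted, so this is closer to a plain Blom-style construction than to CARPY+ itself. The parameter values and the helper names (`vandermonde`, `random_symmetric`, `matmul`, `pairwise_key`) are assumptions made for illustration.

```python
import random

def vandermonde(seed, rows, cols, q):
    # Public matrix G: column j is (1, g_j, g_j^2, ...) with g_j = seed^(j+1) mod q.
    return [[pow(seed, i * (j + 1), q) for j in range(cols)] for i in range(rows)]

def random_symmetric(n, q, rng):
    # Secret symmetric matrix D over F_q.
    d = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d[i][j] = d[j][i] = rng.randrange(q)
    return d

def matmul(a, b, q):
    # Matrix product modulo q.
    return [[sum(x * y for x, y in zip(row, col)) % q for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

# Illustrative parameters (assumed): q = 8191 is prime, lam plays the role of lambda.
q, lam, n_nodes = 8191, 3, 6
rng = random.Random(0)
D = random_symmetric(lam + 1, q, rng)
G = vandermonde(3, lam + 1, n_nodes, q)
A = transpose(matmul(D, G, q))  # A = (D.G)^T, one secret row per node

def pairwise_key(u, v):
    # Node u combines its own secret row A[u] with the public column v of G.
    return sum(A[u][i] * G[i][v] for i in range(lam + 1)) % q
```

Because $D$ is symmetric, `pairwise_key(u, v)` and `pairwise_key(v, u)` agree for every pair, so both nodes derive the same value without exchanging any message.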

B. Basic Idea

In the CFA scheme, the network planner, before sensor deployment, selects a secret polynomial $f(x,y,z,w)$ from the set $\mathfrak{F}$ (to be defined in Eq. (1) later), whose coefficients should be kept secret, thereby constituting the security basis of CFA. For simplicity, we assume that the degree of each variable in $f(x,y,z,w)$ is the same, namely $d$, although the degrees can be distinct in our scheme. For each node $u$, the network planner constructs two polynomials, $f_{u,1}(y,z,w) = f(u,y,z,w)$ and $f_{u,2}(x,z,w) = f(x,u,z,w)$. Since directly storing these two polynomials enables the adversary to obtain the coefficients of $f(x,y,z,w)$ by capturing a few nodes, the authentication polynomial $\mathrm{auth}_u(y,z,w)$ and verification polynomial $\mathrm{verf}_u(x,z,w)$ should be constructed from the polynomials $f_{u,1}(y,z,w)$ and $f_{u,2}(x,z,w)$, respectively, by adding independent perturbation polynomials. Afterwards, the authentication and verification polynomials, instead of $f_{u,1}(y,z,w)$ and $f_{u,2}(x,z,w)$, are stored in node $u$. For source node $u$, the MAC attached to the message $m$ is calculated according to its own authentication polynomial. Let the verification number be the result calculated from the verification polynomial $\mathrm{verf}_u(x,z,w)$ by substituting the claimed source node ID, the shared pairwise key, and the hashed message into $x$, $z$, and $w$, respectively. The receiving node considers the received message authentic and intact if and only if the verification difference, which is the difference between the received MAC and its calculated verification number, is within a certain predetermined range.

Although our CFA scheme is similar to Zhang et al.'s scheme [39] in that both rely on polynomial evaluation, the design strategies used in the CFA scheme are different from the ones in [39]. In Zhang et al.'s scheme, due to the improper use of perturbation, the nodes' IDs are forced to be changed, resulting in the limitation of hardware dependence. In addition, as an arbitrary secret polynomial can be used in [39], immediate authentication can be achieved only if the message authentication code forms a polynomial. In contrast, since the secret polynomial $f(x,y,z,w)$ in CFA is selected such that certain properties are satisfied, the message authentication code can be reduced from a polynomial to a single number, resulting in less communication overhead (packet overhead). Moreover, whereas the pairwise key has been considered useless in providing either immediate authentication or resilience to node compromises in previous methods, in this paper we find that the pairwise key is helpful in enhancing the security while retaining the property of immediate authentication. Hence, all these characteristics substantially differentiate CFA from [39].

In the following two subsections, the off-line step and on-line step, respectively, will be described. 

C. Off-line Step of CFA scheme 

Before deploying sensor nodes, the network planner picks a parameter $q$ from which a finite field $\mathbb{F}_q$ is built. All of the operations throughout the paper are performed over $\mathbb{F}_q$ unless specifically mentioned. Let $\mathcal{I}$ be the set of node IDs. Let $\ell$ be the least number of bits sufficient to represent $q$. Assume that node IDs, pairwise keys, and hash values can be represented in $\mathbb{F}_q$. In addition, a security parameter $r < \ell$ is also selected. Then, the secret polynomials $f(x,y,z,w)$, used as the basis for constructing both authentication and verification polynomials, are defined in the constrained function set $\mathfrak{F}$, where

$$\begin{aligned}
\mathfrak{F} = \{\, f(x,y,z,w) \mid\ & |f(x,y,z,w) - f(x,y',z',w)| < 2^{r-1},\\
& |f(x,y,z,w) - f(x',y',z',w)| > 3 \cdot 2^{r-1} - 1,\\
& |f(x,y,z,w) - f(x',y',z',w')| > 3 \cdot 2^{r-1} - 1,\\
& x, y \in \mathcal{I},\ x' \neq x,\ y' \neq y,\ z' \neq z,\ w' \neq w,\ r < \ell \,\}.
\end{aligned} \tag{1}$$

The authentication polynomial, $\mathrm{auth}_u(y,z,w) = f(u,y,z,w) + n_{u,\mathfrak{a}}(y,z)$, and verification polynomial, $\mathrm{verf}_u(x,z,w) = f(x,u,z,w) + n_{u,\mathfrak{v}}(x,z)$, are stored in each node $u$, where the polynomials $n_{u,\mathfrak{a}}(y,z)$ and $n_{u,\mathfrak{v}}(x,z)$, used for perturbation, are randomly selected from the authentication perturbation set, $\mathfrak{N}_{\mathfrak{a}} = \{n(y,z) \mid 0 \le n(y,z) \le 2^{r-2} - 1,\ y \in \mathcal{I},\ 0 \le y, z \le q-1\}$, and the verification perturbation set, $\mathfrak{N}_{\mathfrak{v}} = \{n(x,z) \mid 0 \le n(x,z) \le 2^{r-1} - 1,\ x \in \mathcal{I},\ 0 \le x, z \le q-1\}$, respectively. Though the sets $\mathfrak{N}_{\mathfrak{a}}$ and $\mathfrak{N}_{\mathfrak{v}}$ appear to be artificial, they guarantee the efficiency and feasibility of immediate authentication of CFA. In addition, constructing $\mathrm{auth}_u(y,z,w)$ and $\mathrm{verf}_u(x,z,w)$ from $\mathfrak{N}_{\mathfrak{a}}$ and $\mathfrak{N}_{\mathfrak{v}}$ may be time- and energy-consuming. Nevertheless, this can be acceptable because such construction is performed only by the network planner, instead of by sensor nodes. If the time required for constructing $\mathrm{auth}_u(y,z,w)$ and $\mathrm{verf}_u(x,z,w)$ is still an issue that cannot be ignored, an efficient method for constructing the polynomials in a restricted version of $\mathfrak{F}$ will be discussed later in Sec. III-E. The off-line procedure of CFA is described in Fig. 1.

Algorithm: CFA-Off-line-Step($q$, $r$)
1. Randomly pick a secret polynomial $f(x,y,z,w) \in \mathfrak{F}$
2. for each node $u$
3.     Randomly pick $n_{u,\mathfrak{a}}(y,z) \in \mathfrak{N}_{\mathfrak{a}}$ and $n_{u,\mathfrak{v}}(x,z) \in \mathfrak{N}_{\mathfrak{v}}$
4.     Store $\mathrm{auth}_u(y,z,w) := f(u,y,z,w) + n_{u,\mathfrak{a}}(y,z)$
5.     Store $\mathrm{verf}_u(x,z,w) := f(x,u,z,w) + n_{u,\mathfrak{v}}(x,z)$

Fig. 1. Off-line Step of CFA.
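To make the off-line step concrete, the following Python sketch mimics Fig. 1 under strong simplifying assumptions. The function `toy_f` is a hypothetical stand-in for the secret $f$: it is not a polynomial over $\mathbb{F}_q$ and carries no security claim; it is chosen only because it satisfies the three inequalities of Eq. (1) over the integers. The perturbations are modelled as hash-derived bounded functions rather than polynomials, and the names `bounded_noise` and `offline_step` as well as the constants `R`, `C`, and `W` are likewise assumptions.

```python
import hashlib
import random

R = 8             # security parameter r (assumed value)
C = 2 ** (R + 1)  # spacing between values with different (x, w)
W = 2 ** 16       # hashed messages are assumed to lie in [0, W)

def toy_f(x, y, z, w):
    # Same (x, w): values differ by less than 2^(r-1).
    # Different x: values differ by at least C - 2^(r-1) + 1 > 3 * 2^(r-1) - 1.
    return C * (x * W + w) + (5 * y + 3 * z) % 2 ** (R - 1)

def bounded_noise(seed, bound):
    # Deterministic perturbation n(a, b) with values in {0, ..., bound - 1}.
    def n(a, b):
        digest = hashlib.sha256(f"{seed}:{a}:{b}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % bound
    return n

def offline_step(node_ids, rng):
    # Mirrors lines 2-5 of Fig. 1: per-node perturbations, then auth_u and verf_u.
    store = {}
    for u in node_ids:
        n_a = bounded_noise(rng.getrandbits(64), 2 ** (R - 2))  # {0..2^(r-2)-1}
        n_v = bounded_noise(rng.getrandbits(64), 2 ** (R - 1))  # {0..2^(r-1)-1}
        auth = lambda y, z, w, u=u, n_a=n_a: toy_f(u, y, z, w) + n_a(y, z)
        verf = lambda x, z, w, u=u, n_v=n_v: toy_f(x, u, z, w) + n_v(x, z)
        store[u] = (auth, verf)
    return store
```

In this sketch each node's closures still reference `toy_f` directly; in the scheme described above, a node would hold only the already-perturbed polynomials, which is what keeps the coefficients of $f$ hidden.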

D. On-line Step of CFA scheme 

After sensor deployment, a sensor node may work as a source node, intermediate node, or destination node, depending on whether a message is to be sent or verified. In the following, we describe the operations a node performs in each of these roles. It should be noted that the pairwise key $K_{u,v} = K_{v,u}$ used here is constructed by applying the CARPY+ scheme [33] on nodes $u$ and $v$, respectively.

Source node (Message transmission). When node $u$ wants to send a message $m$ to node $v$, it calculates the message authentication code:

$$\mathrm{MAC}_u(v,m) = \mathrm{auth}_u(v, K_{u,v}, h(m)) + n_{u,s},$$

where $n_{u,s}$ is randomly picked from the set $\{0, \ldots, 2^{r-2}\}$. Then, the packet $\mathcal{M} = (u, v, m, \mathrm{MAC}_u(v,m))$ is sent to $v$, possibly through a multi-hop path. Note that the message authentication code $\mathrm{MAC}_u(v,m)$ is only a single number here.

Destination node (Message verification). After receiving the packet $\mathcal{M} = (u, v, m, \mathrm{MAC}_u(v,m))$, the destination node $v$ first calculates the verification number

$$\mathrm{verf}_v(u, K_{v,u}, h(m)),$$

according to its own verification polynomial $\mathrm{verf}_v(x,z,w)$, and then calculates the corresponding verification difference $\mathrm{VD}_{v,u}$:

$$\mathrm{VD}_{v,u} = |\mathrm{verf}_v(u, K_{v,u}, h(m)) - \mathrm{MAC}_u(v,m)|.$$

If $\mathrm{VD}_{v,u}$ is within the range $\{0, \ldots, 2^{r-1} - 1\}$, where $r$ is the security parameter mentioned in Sec. III-C, then the authenticity and integrity of the packet $\mathcal{M}$ are successfully verified. Otherwise, the packet $\mathcal{M}$ is dropped. The principle behind this step is as follows:

$$\begin{aligned}
& \mathrm{verf}_v(u, K_{v,u}, h(m)) - \mathrm{MAC}_u(v,m)\\
&= \big(f(u,v,K_{v,u},h(m)) + n_{v,\mathfrak{v}}(u,K_{v,u})\big) - \big(f(u,v,K_{u,v},h(m)) + n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s}\big)\\
&= \big(f(u,v,K_{v,u},h(m)) - f(u,v,K_{u,v},h(m))\big) + \big(n_{v,\mathfrak{v}}(u,K_{v,u}) - (n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s})\big)\\
&= n_{v,\mathfrak{v}}(u,K_{v,u}) - \big(n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s}\big),
\end{aligned} \tag{2}$$

where the last equality holds because $K_{v,u} = K_{u,v}$. From the rules of constructing authentication and verification polynomials, we know that $n_{v,\mathfrak{v}}(u,K_{v,u}) \in \{0, \ldots, 2^{r-1} - 1\}$, $n_{u,\mathfrak{a}}(v,K_{u,v}) \in \{0, \ldots, 2^{r-2} - 1\}$, and $n_{u,s} \in \{0, \ldots, 2^{r-2}\}$. Thus, when $\mathcal{M}$ is genuine, the verification difference $\mathrm{VD}_{v,u} = |\mathrm{verf}_v(u,K_{v,u},h(m)) - \mathrm{MAC}_u(v,m)|$ must be within $\{0, \ldots, 2^{r-1} - 1\}$.
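The final range claim can be checked by bounding the right-hand side of Eq. (2) from both sides. The short derivation below uses only the three perturbation ranges stated in the text; the numerical instance with $r = 4$ is an illustrative assumption.

```latex
% Extremes of n_{v,\mathfrak{v}} - (n_{u,\mathfrak{a}} + n_{u,s}):
\begin{aligned}
\text{minimum:}\quad & 0 - \big((2^{r-2} - 1) + 2^{r-2}\big) = -(2^{r-1} - 1),\\
\text{maximum:}\quad & (2^{r-1} - 1) - (0 + 0) = 2^{r-1} - 1,\\
\text{hence}\quad & \mathrm{VD}_{v,u} \le 2^{r-1} - 1.
\end{aligned}
% Example with r = 4: n_{v,\mathfrak{v}} \in \{0,\dots,7\},
% n_{u,\mathfrak{a}} \in \{0,\dots,3\}, n_{u,s} \in \{0,\dots,4\},
% so the difference lies in [-7, 7] and VD_{v,u} \in \{0,\dots,7\}.
```

Note that the bound is tight at the lower end: the ranges of $n_{u,\mathfrak{a}}$ and $n_{u,s}$ together exactly fill the room left by the range of $n_{v,\mathfrak{v}}$.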
Intermediate node (Message verification). After receiving the packet $\mathcal{M} = (u, v, m, \mathrm{MAC}_u(v,m))$, the intermediate node $e$ first calculates $\mathrm{verf}_e(u, K_{e,u}, h(m))$ according to its own verification polynomial $\mathrm{verf}_e(x,z,w)$ and then calculates the verification difference $\mathrm{VD}_{e,u} = |\mathrm{verf}_e(u, K_{e,u}, h(m)) - \mathrm{MAC}_u(v,m)|$. If $\mathrm{VD}_{e,u}$ is within the range $\{0, \ldots, 2^r - 1\}$, then the authenticity of the packet $\mathcal{M}$ is successfully verified, and the packet $\mathcal{M}$ will be forwarded by node $e$. Otherwise, the packet $\mathcal{M}$ is dropped. The principle behind this step is as follows. When a genuine packet $\mathcal{M}$ is received, we obtain:

$$\begin{aligned}
& \mathrm{verf}_e(u, K_{e,u}, h(m)) - \mathrm{MAC}_u(v,m)\\
&= \big(f(u,e,K_{e,u},h(m)) + n_{e,\mathfrak{v}}(u,K_{e,u})\big) - \big(f(u,v,K_{u,v},h(m)) + n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s}\big)\\
&= \big(f(u,e,K_{e,u},h(m)) - f(u,v,K_{u,v},h(m))\big) + \big(n_{e,\mathfrak{v}}(u,K_{e,u}) - (n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s})\big).
\end{aligned} \tag{3}$$

By the construction of $\mathfrak{F}$, we know:

$$|f(u,e,K_{e,u},h(m)) - f(u,v,K_{u,v},h(m))| < 2^{r-1}. \tag{4}$$

In addition, from the rules of constructing authentication and verification polynomials, we know that $n_{e,\mathfrak{v}}(u,K_{e,u}) \in \{0, \ldots, 2^{r-1} - 1\}$, $n_{u,\mathfrak{a}}(v,K_{u,v}) \in \{0, \ldots, 2^{r-2} - 1\}$, and $n_{u,s} \in \{0, \ldots, 2^{r-2}\}$. Therefore, the verification difference $\mathrm{VD}_{e,u}$ must be within $\{0, \ldots, 2^r - 1\}$.
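The intermediate-node range follows from the triangle inequality applied to the two bracketed terms of Eq. (3). The derivation below is a sketch of that step under the assumption that all quantities are integer-valued; the instance with $r = 4$ is illustrative only.

```latex
\begin{aligned}
\mathrm{VD}_{e,u}
 &\le \big|f(u,e,K_{e,u},h(m)) - f(u,v,K_{u,v},h(m))\big|
    + \big|n_{e,\mathfrak{v}}(u,K_{e,u}) - (n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s})\big|\\
 &\le (2^{r-1} - 1) + (2^{r-1} - 1) = 2^{r} - 2 \le 2^{r} - 1.
\end{aligned}
% Example with r = 4: the f-difference is at most 7 by Eq. (4),
% the perturbation difference is at most 7 in magnitude,
% so VD_{e,u} \le 14, which lies inside \{0,\dots,15\}.
```

The acceptance window at an intermediate node is thus twice as wide as at the destination, because the intermediate node evaluates $f$ at its own ID and key rather than at those of $v$.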

On the other hand, consider the case where node $u$ has been compromised by the adversary. The adversary now wants to deceive $v$ into believing that a message $m$ sent by $u$ was sent by $u' \neq u$. Consider the modified packet,

$$\mathcal{M}' = (u', v, m, \mathrm{MAC}_u(v,m)), \tag{5}$$

where $u'$ is the node ID the adversary pretends to have. Note that we only consider the adversary who exploits the information obtained from a single captured node $u$, and focus on the use of the constructed set $\mathfrak{F}$. The verification procedure at the intermediate node $e$ is as follows:

$$\begin{aligned}
& \mathrm{verf}_e(u', K_{e,u'}, h(m)) - \mathrm{MAC}_u(v,m)\\
&= \big(f(u',e,K_{e,u'},h(m)) + n_{e,\mathfrak{v}}(u',K_{e,u'})\big) - \big(f(u,v,K_{u,v},h(m)) + n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s}\big)\\
&= \big(f(u',e,K_{e,u'},h(m)) - f(u,v,K_{u,v},h(m))\big) + \big(n_{e,\mathfrak{v}}(u',K_{e,u'}) - (n_{u,\mathfrak{a}}(v,K_{u,v}) + n_{u,s})\big).
\end{aligned} \tag{6}$$

By the construction of $\mathfrak{F}$, we know:

$$|f(u',e,K_{e,u'},h(m)) - f(u,v,K_{u,v},h(m))| > 3 \cdot 2^{r-1} - 1. \tag{7}$$

In addition, from the construction of authentication and verification polynomials, we know that $n_{e,\mathfrak{v}}(u',K_{e,u'}) \in \{0, \ldots, 2^{r-1} - 1\}$, $n_{u,\mathfrak{a}}(v,K_{u,v}) \in \{0, \ldots, 2^{r-2} - 1\}$, and $n_{u,s} \in \{0, \ldots, 2^{r-2}\}$. The first term of Eq. (6) therefore has magnitude at least $3 \cdot 2^{r-1}$, while the second term has magnitude at most $2^{r-1} - 1$, so $\mathrm{VD}_{e,u'} \ge 2^{r} + 1$. Hence, the verification difference $\mathrm{VD}_{e,u'}$ cannot be within $\{0, \ldots, 2^r - 1\}$ and the packet $\mathcal{M}'$ will be dropped. In other words, once the source node ID of a message is modified, such malicious manipulation will be deterministically detected by the intermediate nodes. The on-line procedure of CFA is described in Fig. 2.
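The three roles of the on-line step can be exercised end to end with a small Python sketch. As before, `toy_f` is a hypothetical stand-in that satisfies the inequalities of Eq. (1) over the integers but is neither a polynomial over $\mathbb{F}_q$ nor secure, `pairwise_key` is a placeholder for the CARPY+ key, and the perturbations are hash-derived bounded values. All function names and constants are assumptions for illustration.

```python
import hashlib
import random

R = 8                         # security parameter r (assumed value)
C, W = 2 ** (R + 1), 2 ** 16  # spacing constant and range of hashed messages

def h(message):
    # Hashed message reduced into [0, W).
    return int.from_bytes(hashlib.sha256(message).digest()[:4], "big") % W

def toy_f(x, y, z, w):
    # Stand-in for the secret f; see the assumptions stated above.
    return C * (x * W + w) + (5 * y + 3 * z) % 2 ** (R - 1)

def noise(seed, a, b, bound):
    # Deterministic bounded perturbation in {0, ..., bound - 1}.
    d = hashlib.sha256(f"{seed}:{a}:{b}".encode()).digest()
    return int.from_bytes(d[:8], "big") % bound

def pairwise_key(a, b):
    # Placeholder for the symmetric key K_{a,b} = K_{b,a}.
    return noise("pairwise", min(a, b), max(a, b), 2 ** 16)

def auth(u, y, z, w):   # auth_u(y, z, w)
    return toy_f(u, y, z, w) + noise(f"a{u}", y, z, 2 ** (R - 2))

def verf(u, x, z, w):   # verf_u(x, z, w)
    return toy_f(x, u, z, w) + noise(f"v{u}", x, z, 2 ** (R - 1))

def make_packet(u, v, m, rng):
    # Source node: MAC_u(v, m) = auth_u(v, K_{u,v}, h(m)) + n_{u,s}.
    n_s = rng.randint(0, 2 ** (R - 2))  # n_{u,s} in {0, ..., 2^(r-2)}
    return (u, v, m, auth(u, v, pairwise_key(u, v), h(m)) + n_s)

def verification_difference(node, packet):
    src, _dst, m, mac = packet
    return abs(verf(node, src, pairwise_key(node, src), h(m)) - mac)

def destination_accepts(v, packet):
    return verification_difference(v, packet) <= 2 ** (R - 1) - 1

def intermediate_forwards(e, packet):
    return verification_difference(e, packet) <= 2 ** R - 1
```

For example, a packet built with `make_packet(1, 2, b"event", random.Random(7))` is forwarded by any intermediate node and accepted by node 2, whereas the same packet with its source field rewritten to another ID is dropped at the first intermediate node, in line with Eqs. (6) and (7).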



E. Implementation Issues 

The effectiveness and efficiency of the proposed CFA scheme rely on the use of $\mathrm{auth}_u(y,z,w)$ and $\mathrm{verf}_u(x,z,w)$, which are built from the constrained function set $\mathfrak{F}$, the authentication perturbation set $\mathfrak{N}_{\mathfrak{a}}$, and the verification perturbation set $\mathfrak{N}_{\mathfrak{v}}$. As the construction of $\mathfrak{N}_{\mathfrak{a}}$ and $\mathfrak{N}_{\mathfrak{v}}$ is relatively easy, in this section we focus on the construction of $\mathrm{auth}_u(y,z,w)$ and $\mathrm{verf}_u(x,z,w)$, with particular emphasis on the construction of $f(x,y,z,w)$.

Algorithm: CFA-On-line-Step
Scenario: node $u$ sends a message $m$ to node $v$

Source node $u$:
1. Calculate $K_{u,v}$ and $h(m)$
2. Compute $\mathrm{MAC}_u(v,m) := \mathrm{auth}_u(v, K_{u,v}, h(m)) + n_{u,s}$,
   where $n_{u,s}$ is randomly picked from $\{0, \ldots, 2^{r-2}\}$
3. Send the packet $\mathcal{M} := (u, v, m, \mathrm{MAC}_u(v,m))$

Intermediate node $e$ (on receiving $\mathcal{M}$):
1. Calculate $K_{e,u}$ and $h(m)$
2. Calculate $\mathrm{VD}_{e,u} := |\mathrm{verf}_e(u, K_{e,u}, h(m)) - \mathrm{MAC}_u(v,m)|$
3. if $\mathrm{VD}_{e,u} \in \{0, \ldots, 2^r - 1\}$
   then forward $\mathcal{M}$ else drop $\mathcal{M}$

Destination node $v$ (on receiving $\mathcal{M}$):
1. Calculate $K_{v,u}$ and $h(m)$
2. Calculate $\mathrm{VD}_{v,u} := |\mathrm{verf}_v(u, K_{v,u}, h(m)) - \mathrm{MAC}_u(v,m)|$
3. if $\mathrm{VD}_{v,u} \in \{0, \ldots, 2^{r-1} - 1\}$
   then accept $\mathcal{M}$ else drop $\mathcal{M}$

Fig. 2. On-line Step of CFA.

A straightforward method for deriving a proper $f(x,y,z,w)$ is to construct the whole set $\mathfrak{F}$ and then randomly pick one polynomial from it. When the coefficients of the polynomials in $\mathfrak{F}$ are constrained to $\mathbb{F}_q$, there are $q^{(d+1)^4}$ possible four-variate $d$-degree polynomials. Thus, $O(q^{2(d+1)^4})$ tests are required, because each of the $q^{(d+1)^4}$ four-variate $d$-degree polynomials needs to be checked against the constraints $|f(x,y,z,w) - f(x',y',z',w')| > 3 \cdot 2^{r-1} - 1$, $|f(x,y,z,w) - f(x',y',z',w)| > 3 \cdot 2^{r-1} - 1$, and $|f(x,y,z,w) - f(x,y',z',w)| < 2^{r-1}$ in $\mathfrak{F}$ by examining the other $q^{(d+1)^4} - 1$ possibilities of different input variables. The above construction of $\mathfrak{F}$ is accomplished before sensor deployment by the network planner, which is usually assumed to be resource-abundant, and is thus feasible. Despite its feasibility, such an exhaustive search is not sufficiently efficient. In the following, we develop an efficient algorithm that trades deterministic security for construction efficiency, on the basis of the observation that, in some cases, a variant of $\mathfrak{F}$ is sufficient for our use and the search for a variant of $\mathfrak{F}$ can accelerate the construction of $f(x,y,z,w)$. Hence, we focus on how to efficiently construct a variant of the original constrained function set $\mathfrak{F}$.

Let $\mathfrak{F}'$ be the weak constrained function set as follows:

$$\mathfrak{F}' = \{\, f(x,y,z,w) \mid |f(x,y,z,w) - f(x,y',z',w)| < 2^{r-1},\ x, y \in \mathcal{I},\ x' \neq x,\ y' \neq y,\ z' \neq z,\ w' \neq w,\ r < \ell \,\}. \tag{8}$$

Obviously, $\mathfrak{F}$ is a subset of $\mathfrak{F}'$, since some constraints in $\mathfrak{F}$ are discarded. As to the construction of $f(x,y,z,w)$ in $\mathfrak{F}'$, our idea is to construct a random subset of $\mathfrak{F}'$ that is as large as possible and to sample a polynomial from it. Assuming that $x \in [x_{\min}, x_{\max}]$, $y \in [y_{\min}, y_{\max}]$, $z \in [z_{\min}, z_{\max}]$, and $w \in [w_{\min}, w_{\max}]$, we want to construct a polynomial $f(x,y,z,w)$ satisfying the constraints in $\mathfrak{F}'$, as shown in the following.

Assume that $f(x,y,z,w) = \sum_{i,j,k,m=0}^{d} a_{i,j,k,m} x^i y^j z^k w^m$, $d \in \mathbb{Z}^+$, $a_{i,j,k,m} \in \mathbb{F}_q$. According to the definition of $f(x,y,z,w)$ in $\mathfrak{F}'$, $f(x,y,z,w)$ can be rewritten as:

$$\sum_{i,m=0}^{d} a_{i,0,0,m} x^i w^m + \sum_{i,m=0,\ j,k=1}^{d} a_{i,j,k,m} x^i y^j z^k w^m. \tag{9}$$

With the representation in Eq. (9), the term $f(x,y,z,w) - f(x,y',z',w)$ can be written as:

$$\begin{aligned}
& \Big(\sum_{i,m=0}^{d} a_{i,0,0,m} x^i w^m + \sum_{i,m=0,\ j,k=1}^{d} a_{i,j,k,m} x^i y^j z^k w^m\Big) - \Big(\sum_{i,m=0}^{d} a_{i,0,0,m} x^i w^m + \sum_{i,m=0,\ j,k=1}^{d} a_{i,j,k,m} x^i (y')^j (z')^k w^m\Big)\\
&= \sum_{i,m=0,\ j,k=1}^{d} a_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big).
\end{aligned} \tag{10}$$

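According to the signs of the $a_{i,j,k,m}$'s, Eq. (10) can be further rewritten as:

$$\sum_{i,m=0,\ j,k=1}^{d} \Big( a^{+}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) + a^{-}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) \Big), \tag{11}$$

where

$$a^{+}_{i,j,k,m} = \begin{cases} a_{i,j,k,m}, & \text{if } a_{i,j,k,m} \ge 0,\\ 0, & \text{if } a_{i,j,k,m} < 0, \end{cases} \qquad \text{and} \qquad a^{-}_{i,j,k,m} = \begin{cases} 0, & \text{if } a_{i,j,k,m} \ge 0,\\ a_{i,j,k,m}, & \text{if } a_{i,j,k,m} < 0. \end{cases}$$

By taking the constraint $|f(x,y,z,w) - f(x,y',z',w)| < 2^{r-1}$ in $\mathfrak{F}'$ and Eq. (11) into consideration, we have:

$$-2^{r-1} < \sum_{i,m=0,\ j,k=1}^{d} \Big( a^{+}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) + a^{-}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) \Big) < 2^{r-1}. \tag{12}$$

With $-2^{r-1}$ and $2^{r-1}$ being the lower bound and upper bound of $f(x,y,z,w) - f(x,y',z',w)$, respectively, Eq. (12) can be rewritten as:

$$\begin{aligned}
\max\Big\{ \sum_{i,m=0,\ j,k=1}^{d} \Big( a^{+}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) + a^{-}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) \Big) \Big\} &< 2^{r-1},\\
\min\Big\{ \sum_{i,m=0,\ j,k=1}^{d} \Big( a^{+}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) + a^{-}_{i,j,k,m} x^i w^m \big(y^j z^k - (y')^j (z')^k\big) \Big) \Big\} &> -2^{r-1}.
\end{aligned} \tag{13}$$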
Here, we define [/+] = {(i,j,k,m)\f(x,y,z,w) = Yi,j,k, m =o a %hKrnX i y j z k w m ,a idAm > 0} and [f~] = 
{(i,j,k,m)\f(x,y,z,w) = Ylij,k,m=o a i,j,k,mX i y j z k w m ,a idAm < 0}. We can examine if a given set of 
«ij,fc,m's, Vi,j,h,m, constitutes a polynomial f(x,y,z,w) of g"' by exploiting the definitions in Eq. © 
and considering the extremes in Eq. (fT3l) shown as follows: 

^li,m=0,j,k=l (^i,j,k,rn'^m&x^ma,x ^min^min 

+r>~ r* 7/; m (v j z k —vi z k M < 2 r ~ 1 

1 ij,t,m mai max I fmin mm Wmax max / / — 
J^j,m=0J,fc=l ^ij^^m^max^max ^min^min fmax^max 

+<t~ , ?/; m I?/- 3 z k — v j z k ■ I I > — 2 r_1 

1 ij,fc,m max^max I i>max max Wmin mm / / — 

Define /'(a;, y, z, w) = J^tj k m=o a 'i,j,k,m xl y jzkwm > which is only different from f(x, y, z, w) in the part of 
coefficients. From Eq. (fl4l) . we can observe that the possible range of \f'(x, y, z, w)—f'(x, y', z', w) | will be 
contained in \ f(x, y, z, w)-f(x, y', z', w)\, i.e., max{/'(i, y, z, w)-f'(x, y', z', w)} < max{/(i, y, z, w)- 
f(x,y',z',w)} and mm{ f ' (x, y, z,w) - f'(x,y', z',w)} > mm{f(x,y,z,w) - f(x,y',z',w)}, if (i) 
<Xij,k,m - a i,j,k,m > °> V(i,j,fc,m) G [/+], or (ii) a i>jAm - a' ijkm < 0, \/(i,j,k,m) E [f~]. With 
this monotone property, our algorithm, randomly sampling a polynomial from a random subset of 
whose pseudo code is shown in Fig. |3j can be described as follows. 

As oa,j,k,m in f(x,y,z,w) denotes the coefficient of x l y j z k w m for specified i,j,k,m, we use [a] to 
denote an instance of a^^m's, \/i,j,k,m. At the beginning of ^'-Construction algorithm shown in Fig. 



(14) 
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Algorithm: ^'-Construction ([«] is the final output of this algorithm) 


1. 


randomly select [a] and construct \f + ] and If ] 


2. 


while [al cannot satisfy Eq. (1741) 

L J ^ " M—T 


3. 


[a] := [a/2] 

L J L / J 


4. 


randomly construct Vt := Hi, j, k,m)\0 < i, j, k,m < d} with \Q\ > 0, and set \6] = \a] 


5. 


for each element (i, j, k,m) in f2 


6. 


if (i,j,k,m) G [f+] 

\ 7 U 7 7 / L'' J 


7. 


find the maximum </> such that ([al, (i, j, k, m), ip) is satisfied with Eq. (fl4l) 


8. 


(j) is randomly selected from [0,<p], and set [a] := ([a], (i,j,k,m),(j)) 


9. 


if (i,j,k,m) G [/"] 


10. 


find the minimum such that (lal 7 t??,) co) is satisfied with Ro (1 1 41) 

in iu liiv iiiiiniii Liiii y juvii uiui \ i — t. - I fc^ j j 5 / ? / iau jiiwu vv i lii j — /\j • yj i_ t |/ 


11. 


is randomly selected from [0,<p], and set [a] := ([a], (i, j, k,m), 4>) 


• 5'- 


Construction algorithm 



3, we randomly choose $[\alpha]$ and determine whether the chosen $[\alpha]$ satisfies Eq. (14). If $[\alpha]$ fails to satisfy
Eq. (14), $[\alpha] := [\alpha/2]$ is checked repeatedly until Eq. (14) is satisfied (Lines 1~3). Here, $[\alpha/2]$ consists of
$\lfloor \alpha_{i,j,k,m}/2 \rfloor$'s, where each $\alpha_{i,j,k,m}$ is an element in $[\alpha]$. Note that the loop (Lines 2~3) is guaranteed to
terminate because at least the setting $\alpha_{i,j,k,m} = 0$, $\forall i,j,k,m$, is satisfiable. With the
monotone property, we can also guarantee that any polynomial sampled from $\{[\alpha'] \mid [\alpha'] \preceq [\alpha]\}$ is one of
the polynomials in $\mathcal{F}'$. Here, $[\alpha'] \preceq [\alpha]$ means that the possible range of $|f'(x,y,z,w) - f'(x,y',z',w)|$
is contained in that of $|f(x,y,z,w) - f(x,y',z',w)|$. Thus, after the execution of Line 3, we can sample a
polynomial $f(x,y,z,w) \in \mathcal{F}'$ from the sample space $\{[\alpha'] \mid [\alpha'] \preceq [\alpha]\}$. Nevertheless, we can, in fact, further
extend the sample space by tuning selected $\alpha_{i,j,k,m}$'s (Lines 5~11). For example, suppose $|\Omega|$ $\alpha_{i,j,k,m}$'s
are chosen to be tuned. In particular, defining $([\alpha],(i,j,k,m),\varphi)$ as $[\alpha]$ whose $\alpha_{i,j,k,m}$ is
replaced by $\varphi$, we can extend the range of $|f'(x,y,z,w) - f'(x,y',z',w)|$ by maximizing (minimizing)
the selected $\alpha_{i,j,k,m}$ if $(i,j,k,m) \in [f^+]$ ($(i,j,k,m) \in [f^-]$) so that the size of $\{[\alpha'] \mid [\alpha'] \preceq [\alpha]\}$ is
increased. Together with $[\alpha]$ obtained after Line 3, Lines 8 and 11 behave like sampling a polynomial from
a subset of $\mathcal{F}'$, which could be randomly different due to the random construction of $\Omega$ (Line 4). Note that
the search for the maximum $\alpha_{i,j,k,m}$ (Line 7) can be accomplished by conducting binary search on the positive
integers greater than $\alpha_{i,j,k,m}$. The minimum $\alpha_{i,j,k,m}$ can be found in a similar way (Line 10). Since we
conduct binary search once for each element in $\Omega$, the running time of the $\mathcal{F}'$-Construction algorithm is
$O(|\Omega| \log q)$. From the theoretical point of view, the algorithm might yield only a useless constant polynomial
and would therefore need to be executed multiple times.
Nevertheless, in practice, when a sufficiently large security parameter $r$ (as defined earlier) is
selected, executing the algorithm once is sufficient for sampling a non-trivial polynomial from $\mathcal{F}'$.
It should be noted that the $\mathcal{F}'$-Construction algorithm does not sample uniformly over $\mathcal{F}'$. As we mentioned
earlier, what we do is construct and then sample from a random subset of $\mathcal{F}'$. Nevertheless, due to
the use of $\Omega$ for tuning randomly selected $\alpha_{i,j,k,m}$'s, we can still guarantee that each polynomial
in $\mathcal{F}'$ has a nonzero probability of being sampled, which provides sufficient security against
directly guessing all the coefficients $\alpha_{i,j,k,m}$'s.
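
The control flow of Fig. 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: Eq. (14) is represented by a caller-supplied predicate `ok`, which is assumed to be monotone in each coefficient; coefficients are assumed to be non-negative integers bounded by a hypothetical `upper`; only the $[f^+]$ branch (Lines 6~8) is shown; and the toy constraint "sum of coefficients at most `budget`" is a stand-in, not the paper's condition.

```python
import random

def halve(alpha):
    """Line 3: replace every coefficient a by floor(a / 2) (non-negative integers assumed)."""
    return {idx: a // 2 for idx, a in alpha.items()}

def max_feasible(alpha, idx, ok, upper):
    """Line 7: binary search for the largest value in [alpha[idx], upper] keeping ok() true.
    Assumes ok() is monotone in this coefficient and that alpha already satisfies ok()."""
    lo, hi = alpha[idx], upper
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if ok({**alpha, idx: mid}):
            lo = mid
        else:
            hi = mid - 1
    return lo

def construct(alpha, ok, omega, upper, rng):
    """Hypothetical sketch of Lines 2~8 of the construction."""
    while not ok(alpha):                      # Lines 2~3: halve until the constraint holds
        alpha = halve(alpha)
    for idx in omega:                         # Lines 5~8: tune the selected coefficients
        phi = max_feasible(alpha, idx, ok, upper)
        alpha = {**alpha, idx: rng.randint(0, phi)}
    return alpha

# Toy usage: degree d = 1 in each variable, stand-in constraint sum(alpha) <= budget.
rng = random.Random(1)
d, upper, budget = 1, 1000, 50
indices = [(i, j, k, m) for i in range(d + 1) for j in range(d + 1)
           for k in range(d + 1) for m in range(d + 1)]
ok = lambda a: sum(a.values()) <= budget
start = {idx: rng.randint(0, upper) for idx in indices}
omega = rng.sample(indices, 4)                # Line 4: random subset of index tuples
result = construct(start, ok, omega, upper, rng)
```

Each call to `max_feasible` performs one binary search per element of `omega`, which mirrors the running-time argument in the text.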

When $f(x,y,z,w)$ is selected from the weak constrained function set $\mathcal{F}'$, the filtering capability is
slightly reduced. Its impact on the security of CFA using $f(x,y,z,w) \in \mathcal{F}'$ is discussed in the following.
Even if $f(x,y,z,w)$ is selected from $\mathcal{F}'$, the destination and the intermediate nodes, when receiving the
genuine message, can still correctly accept and forward the received message, respectively. The validation
procedures are the same as those given earlier and are therefore omitted here. The destination node and
intermediate nodes, however, only probabilistically drop falsified messages in CFA using $f(x,y,z,w) \in \mathcal{F}'$,
instead of deterministically dropping the modified messages as in CFA using $f(x,y,z,w) \in \mathcal{F}$. The principle
behind this change is as follows:

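$$
\begin{aligned}
& \mathit{verf}_{\varepsilon}(u', K_{\varepsilon,u'}, h(m)) - \mathit{MAC}_u(v, m) \\
&= \big(f(u', \varepsilon, K_{\varepsilon,u'}, h(m)) + n_{\varepsilon,v}(u', K_{\varepsilon,u'})\big) - \big(f(u, v, K_{u,v}, h(m)) + n_{u,a}(v, K_{u,v}) + n_{u,s}\big) \\
&= \big(f(u', \varepsilon, K_{\varepsilon,u'}, h(m)) - f(u, v, K_{u,v}, h(m))\big) + \big(n_{\varepsilon,v}(u', K_{\varepsilon,u'}) - (n_{u,a}(v, K_{u,v}) + n_{u,s})\big).
\end{aligned}
\tag{15}
$$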

The first term $(f(u', \varepsilon, K_{\varepsilon,u'}, h(m)) - f(u, v, K_{u,v}, h(m)))$ on the RHS of Eq. (15) can be an arbitrary
element in $\mathbb{F}_q$, which makes the final result in Eq. (15) arbitrary as well. Therefore, the probability
that $|\mathit{verf}_{\varepsilon}(u', K_{\varepsilon,u'}, h(m)) - \mathit{MAC}_u(v, m)|$ happens to be within the range $[-2^r + 1, 2^r - 1]$ is increased
from $0$ to $\frac{2^{r+1}-1}{q}$. With a similar argument, one can also derive the probability of detecting falsified
messages.
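
The count behind this bound can be checked by brute force on a toy field. The sketch below is illustrative only: the parameters `q = 1009` and `r = 4` are arbitrary small values rather than settings from the paper, the function names are hypothetical, and residues are assumed to be interpreted as centered representatives so that negative differences are meaningful.

```python
from fractions import Fraction

def centered(x, q):
    """Map a residue in [0, q) to its representative in [-(q // 2), q // 2] (q odd)."""
    x %= q
    return x - q if x > q // 2 else x

def acceptance_probability(q, bound):
    """Fraction of field elements whose centered absolute value is at most `bound`."""
    hits = sum(1 for x in range(q) if abs(centered(x, q)) <= bound)
    return Fraction(hits, q)

q, r = 1009, 4                      # toy parameters (assumptions)
prob = acceptance_probability(q, 2 ** r - 1)
print(prob)                         # equals (2^(r+1) - 1) / q for these values
```

An arbitrary difference lands in the accepted window with probability equal to the window size, $2^{r+1} - 1$, divided by the field size $q$.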

IV. CFA-Based En-Route Filtering Scheme 

With CFA described in Sec. III, the design of the CFA-based en-route filtering (CFAEF) scheme is
straightforward. The CFAEF scheme consists of three phases: the node initialization phase, the report endorsement
phase, and the en-route filtering phase, which are described in turn below.

Node initialization phase. At first, a global security parameter $t$, which indicates the maximum number
of compromised nodes tolerable in the CFAEF scheme, is selected. If the number of compromised nodes
exceeds $t$, then the adversary can inject falsified data without being detected. It should be noted that such
a limitation also applies to all en-route filtering schemes unless additional location information is used.
In addition, each node $u$ is preloaded with $\mathit{auth}_u(y,z,w)$ and $\mathit{verf}_u(x,z,w)$, prepared for the use of CFA.
Last, the sensor nodes are deployed on the sensing region.

Report endorsement phase. After sensor deployment, a node enters this phase when it has an event
report to be sent*. More specifically, once a node $u$ wants to send an event report $E$ to a destination
node $v$, it first broadcasts $E$ in plaintext to its neighboring nodes. If a neighboring node $\mu$
agrees with $E$, then it generates a MAC, $\mathit{MAC}_{\mu}(v,E)$, via the proposed CFA scheme, and sends it back to $u$
as an endorsement of $E$. After collecting $t$ MACs from the neighboring nodes§
$\mu_1, \ldots, \mu_t$, $u$ first checks whether the value of $|\mathit{verf}_u(\mu_j, E) - \mathit{MAC}_{\mu_j}(v,E)|$, $j = 1, \ldots, t$, is within
the predetermined range $[0, 2^r - 1]$. If some of the collected endorsements $\mathit{MAC}_{\mu_j}(v,E)$ fail to be
verified, $u$ drops all $\mathit{MAC}_{\mu_j}(v,E)$'s and acquires other endorsements from neighboring nodes
other than $\mu_1, \ldots, \mu_t$. Only when all $t$ collected endorsements are successfully verified does $u$ forward
$(E, u, v, \mathit{MAC}_u(v,E), \mu_1, \mathit{MAC}_{\mu_1}(v,E), \ldots, \mu_t, \mathit{MAC}_{\mu_t}(v,E))$ to $v$.

* An event could be simultaneously observed by multiple nodes. Here we assume that one of these detecting nodes is responsible for
sending the event report, but the election of such a node is beyond the scope of this paper.

En-route filtering phase. Once receiving the packet

$(E, u, v, \mathit{MAC}_u(v,E), \mu_1, \mathit{MAC}_{\mu_1}(v,E), \ldots, \mu_t, \mathit{MAC}_{\mu_t}(v,E))$,

the intermediate node $\varepsilon$ first checks whether the attached endorsements are generated by $t + 1$ distinct
nodes. The packet is dropped if this check fails. Afterwards, for each endorser $\nu$ of the $t + 1$ endorsements,
node $\varepsilon$ checks whether $VD_{\varepsilon,\nu} = |\mathit{verf}_{\varepsilon}(\nu, E) - \mathit{MAC}_{\nu}(v,E)|$ is within the predetermined range $[0, 2^r - 1]$.
The packet is forwarded only if node $\varepsilon$ succeeds in verifying all $t + 1$ endorsements. Otherwise, the
packet is dropped. The operation performed by the destination node $v$ is similar to that performed by
the intermediate node. The difference is that $v$ checks whether $VD_{v,\nu} = |\mathit{verf}_v(\nu, E) - \mathit{MAC}_{\nu}(v,E)|$ is
within the predetermined range $[0, 2^{r-1} - 1]$. The event report $E$ is accepted only if $v$ succeeds in verifying
all $t + 1$ endorsements. Otherwise, the packet is dropped.
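
The two checks performed in the en-route filtering phase can be sketched as follows. This is a simplified illustration, not the scheme itself: `verf` is a hypothetical callable standing in for the checking node's verification polynomial evaluated on an endorser and the report, plain integers replace arithmetic in $\mathbb{F}_q$, and the values of `t`, `r`, and the sample MACs are made up.

```python
def passes_filter(endorsements, verf, t, r, is_destination=False):
    """Return True if a packet should be forwarded (or accepted at the destination).

    endorsements: list of (node_id, mac) pairs, the source node included.
    verf: callable(node_id) -> value the checking node computes for that endorser.
    """
    ids = [node for node, _ in endorsements]
    # Check 1: the endorsements must come from t + 1 distinct nodes.
    if len(ids) != t + 1 or len(set(ids)) != t + 1:
        return False
    # Check 2: every verification difference must fall in the allowed range,
    # [0, 2^r - 1] at intermediate nodes and [0, 2^(r-1) - 1] at the destination.
    limit = 2 ** (r - 1) - 1 if is_destination else 2 ** r - 1
    return all(abs(verf(node) - mac) <= limit for node, mac in endorsements)

# Toy data (assumptions): t = 2 endorsers plus the source, r = 4.
expected = {"u": 1000, "mu1": 2000, "mu2": 3000}
verf = expected.__getitem__
genuine = [("u", 1005), ("mu1", 1997), ("mu2", 3007)]
forged = [("u", 1005), ("mu1", 1997), ("mu2", 3400)]
repeated = [("u", 1005), ("mu1", 1997), ("mu1", 1997)]

print(passes_filter(genuine, verf, t=2, r=4))
print(passes_filter(forged, verf, t=2, r=4))
print(passes_filter(repeated, verf, t=2, r=4))
```

A single failed endorsement or a repeated endorser is enough to drop the whole packet, which is what stops a falsified report from travelling further along the path.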

V. Performance and Security Evaluation 

In this section, in addition to analyzing the overhead of CFAEF (Sec. V-A), we study its security (Sec.
V-B) and compare its energy savings with those of other methods (Sec. V-C).

A. Overhead Analysis 

As to the storage overhead, two trivariate polynomials need to be stored in each node in CFA, as shown
in Fig. 1. Therefore, in CFAEF, a storage overhead of $O(d^3)$ is required due to the use of the authentication
and verification polynomials.

§ The WSNs in our consideration possess high node density such that $t$-coverage [12], [27], [28] can be achieved.

For the endorsing node, the computation overhead comes from the calculation of the message
authentication code, which involves trivariate polynomial evaluation and requires $O(d^3)$ arithmetic
operations [4], [25]. On the other hand, the computation overhead for the source node, intermediate
nodes, and destination node is the same, namely $O(td^3)$, because $t$ MACs should be calculated.
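
To make the $O(d^3)$ figure concrete, the following sketch evaluates a trivariate polynomial over $\mathbb{F}_q$ with nested Horner's rule and counts multiplications. It assumes degree at most $d$ in each variable, so that $(d+1)^3$ coefficients are stored; the function name, the coefficient layout `coeffs[j][k][m]`, and the toy values of `d` and `q` are illustrative assumptions, and the paper's cited evaluation methods [4], [25] may differ.

```python
import random

def eval_trivariate(coeffs, y, z, w, q):
    """Evaluate sum_{j,k,m} coeffs[j][k][m] * y^j * z^k * w^m (mod q) by nested Horner's rule.

    Returns (value, number_of_multiplications).
    """
    mults = 0
    acc_y = 0
    for plane in reversed(coeffs):        # Horner's rule in y
        acc_z = 0
        for row in reversed(plane):       # Horner's rule in z
            acc_w = 0
            for c in reversed(row):       # Horner's rule in w
                acc_w = (acc_w * w + c) % q
                mults += 1
            acc_z = (acc_z * z + acc_w) % q
            mults += 1
        acc_y = (acc_y * y + acc_z) % q
        mults += 1
    return acc_y, mults

# Toy instance (assumptions): d = 2, q = 101, random coefficients.
rng = random.Random(0)
d, q = 2, 101
coeffs = [[[rng.randrange(q) for _ in range(d + 1)] for _ in range(d + 1)]
          for _ in range(d + 1)]
y, z, w = 5, 7, 11
value, mults = eval_trivariate(coeffs, y, z, w, q)
print(value, mults)
```

The multiplication count is $(d+1)^3 + (d+1)^2 + (d+1)$, which is $O(d^3)$ per MAC and therefore $O(td^3)$ when $t$ MACs are processed.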

As to the communication overhead of CFAEF, the source node has to communicate with
the neighboring nodes to obtain the endorsements. Moreover, the source node has to send
$(E, u, v, \mathit{MAC}_u(v,E), \mu_1, \mathit{MAC}_{\mu_1}(v,E), \ldots, \mu_t, \mathit{MAC}_{\mu_t}(v,E))$, instead of $(E, u, v)$, to the destination
node. As a result, the additional communication overhead incurred by the use of CFAEF is $O(tH)$, where
$H$ is the average number of hops between two arbitrary nodes in a network.

B. Security 

First, we study the security of the proposed CFA scheme. In particular, we assume that the adversary
attempts to recover the coefficients of $f(x,y,z,w)$. Consider an adversary who can only modify the
transmitted packet and re-transmit the modified one in order to deceive the destination node into accepting
that the packet originates from another node or that the message is authentic. The probability of the
adversary successfully deceiving the destination node can be analyzed as follows. If the message $m$
with $\mathit{MAC}_u(v,m)$ sent by the node $u$ is modified to $m' \neq m$ or $u' \neq u$, then the
probability that the intermediate node forwards the message $m'$ is at most $\frac{2^{r+1}-1}{q}$ and the probability that
the destination node accepts the message $m'$ is at most $\frac{2^{r}-1}{q}$. This can be explained by the fact that, to
deceive the destination node, the best strategy that can be adopted by the adversary is to forge the MAC
corresponding to $m'$ and $u'$. Nonetheless, such a MAC can only be guessed by the adversary. Therefore,
the verification difference would be arbitrary, and the probabilities that $VD_{\varepsilon,u}$ and $VD_{v,u}$ happen to be
within the predetermined ranges are at most $\frac{2^{r+1}-1}{q}$ and $\frac{2^{r}-1}{q}$ for the intermediate node and destination
node, respectively. These values are the numbers of field elements whose absolute value lies in $[0, 2^r - 1]$ and
$[0, 2^{r-1} - 1]$, respectively, divided by $q$.

Second, we consider the case where the adversary not only eavesdrops on the transmitted messages but
also compromises $n$ nodes to use the security information stored in them, trying to recover the coefficients
of $f(x,y,z,w)$. The adversary cannot break $f(x,y,z,w)$ if only $n \leq d$ nodes are
compromised [2]. When the adversary has compromised $n > d$ nodes, the complexity for it to obtain the
coefficients of $f(x,y,z,w)$ is $\Omega(q^{d+1})$. This can be explained as follows. Assume that $u_0, \ldots, u_{n-1}$ are $n$
compromised nodes. Let $x_0$, $z_0$, and $w_0$ be arbitrary elements in $\mathbb{F}_q$. We know that if we can arbitrarily
construct $f(x_0, y, z_0, w_0)$ for any $x_0$, $z_0$, and $w_0$, then the coefficients of $f(x,y,z,w)$ can be inferred by
solving a system of equations. Thus, our goal is to obtain the coefficients of $f(x_0, y, z_0, w_0)$. Note that
the discussion and effect of obtaining the coefficients of, for example, $f(x, y_0, z_0, w_0)$ are the same as those
of obtaining the coefficients of $f(x_0, y, z_0, w_0)$. Thus, we omit the former case and focus only on the
latter case here. Note that $f(x_0, y, z_0, w_0)$ can always be written as $\sum_{j=0}^{d} C_j y^j$. Based on the
construction of $\mathit{verf}_u(x,z,w)$, we can derive the following $n$ equations:

$$\sum_{j=0}^{d} C_j (u_i)^j = \mathit{verf}_{u_i}(x_0, z_0, w_0) - n_{u_i,v}(u_i, z_0), \quad 0 \leq i \leq n-1. \tag{16}$$

In this system of equations, there are $d+1+n$ unknown variables, including $C_j$ ($0 \leq j \leq d$) and $n_{u_i,v}(u_i, z_0)$
($0 \leq i \leq n-1$). There are, however, only $n$ equations. Thus, $d+1$ unknown variables should be eliminated
or correctly guessed. The polynomials $\mathit{auth}_{u_i}(y,z,w)$'s may be used by the adversary to reduce the
number of unknown variables. A common method that is able to reduce the number of unknown variables
is called the reflection attack in [39] and is employed here. Let $a_i = \mathit{verf}_{u_i}(u_0, z_0, w_0) - \mathit{auth}_{u_0}(u_i, z_0, w_0) =
n_{u_i,v}(u_i, z_0) - n_{u_0,a}(u_i, z_0)$. The above equation can be rewritten as $n_{u_i,v}(u_i, z_0) = a_i + n_{u_0,a}(u_i, z_0)$.

Together with this equation, Eq. (16) can be represented as:

$$\sum_{j=0}^{d} C_j (u_i)^j = \mathit{verf}_{u_i}(x_0, z_0, w_0) - a_i - n_{u_0,a}(u_i, z_0), \quad 0 \leq i \leq n-1. \tag{17}$$

It can be observed that the reflection attack does not help in breaking $f(x,y,z,w)$ with higher probability
because there are still $d+1+n$ unknown variables in $n$ equations. Thus, $d+1$ unknown variables should
be eliminated or correctly guessed. Since each unknown variable can be at least $r$ bits long, the
complexity of recovering the coefficients is $\Omega(2^{r(d+1)})$.
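
As a worked instance of this counting argument, take $r = 120$ (the setting used later in the energy evaluation) and, purely as an illustrative assumption, $d = 3$. The number of unknowns that remain undetermined does not depend on how many nodes $n$ are compromised:

$$\underbrace{(d+1)}_{C_j\text{'s}} + \underbrace{n}_{\text{noise terms}} - \underbrace{n}_{\text{equations}} = d+1 = 4, \qquad 2^{r(d+1)} = 2^{120 \cdot 4} = 2^{480}.$$

Each additional compromised node contributes one new equation but also one new unknown noise term, so the gap of $d+1$ unknowns never closes.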

After the security of CFA is established, the resilience of CFAEF against false data injection attack, 
PDoS attack, and FEDoS attack is obvious. For example, CFAEF is resilient to false data injection attack 
and PDoS attack because, with the MACs generated by CFA, the false data can be detected and dropped 
by either intermediate nodes or the destination node when the number of compromised nodes does not 
exceed $t$. In particular, with $t$ endorsements required, the probabilities of detecting the bogus message
on each intermediate node and on the destination node follow from the acceptance bounds derived above. On the other
hand, FEDoS attack is useless because if the compromised node sends a false endorsement to the source
node, the source node, acting as the intermediate node between the endorsing node and destination, can 
identify the false endorsement via the CFA verification, and refuse to communicate with the compromised 
node thereafter. 

C. Energy Savings 

In this section, an energy consumption model similar to that used in [30] is used to analyze the energy
savings of various schemes. Because a higher filtering capability means that less energy is
consumed in forwarding falsified messages, evaluating energy consumption is roughly equivalent
to evaluating the filtering capability.

As described earlier, most of the existing en-route filtering schemes require strict assumptions. For
example, IHA [38] and GREF [32] rely heavily on sophisticated security associations that must be
established within a period of secure bootstrapping time, which is unrealistic in certain cases. Moreover,
some schemes [24], [32], [35] require location information and some others [14], [31], [36] work only
on query-based networks. Therefore, in the following, we focus on the energy consumption comparison
among SEF [34], DEF [30], and our CFAEF scheme, because SEF and DEF achieve a balance among
efficiency, filtering capability, and generality while requiring minimal assumptions.
In particular, [30] shows a general formula

$$E = L_r \left( H + \frac{\beta}{p} \right), \tag{18}$$

for evaluating the energy consumption $E$ of report forwarding. Here, $L_r$, $H$, $\beta$, and $p$ denote the
bit-length of the report plus endorsements, the average number of hops between two arbitrary nodes, the
ratio of false reports to legitimate reports, and the probability of detecting a false report on each
node, respectively. Note that, as demonstrated in [1], the energy consumed by communication is over
1000 times greater than that consumed by computation. Thus, the above formula focuses on the
energy consumption incurred by communication. Let $e_{ord}$ be the energy consumption
of report forwarding when no filtering scheme is used. Let $e_{SEF}$ and $e_{DEF}$ be the energy consumption of
report forwarding when SEF [34] and DEF [30] are used, respectively. Throughout the energy evaluation,
the common parameters $t = 5$, $\beta = 10$, a MAC size of 64 bits, and a report length of 24 bytes
were used for all compared methods. From [30], with default parameter settings, we know
that $e_{ord} = 2112H$, $e_{SEF} = 306(H + 200)$, and $e_{DEF} = 732(H + 36)$. According to Eq. (18), with a
calculation¶ similar to that of [30] and the setting of $q = 127$ and $r = 120$, the energy consumption $e_{CFAEF}$ in CFAEF
can also be derived as $e_{CFAEF} = 512(H + 10)$, because in this case $p = 1$ and $L_r = 24 \times 8 + 5 \times 64 = 512$.
In particular, when $H = 50$, our scheme saves $1 - \frac{e_{CFAEF}}{e_{ord}} \approx 71\%$ of the energy consumed by the scheme without
filtering, $1 - \frac{e_{CFAEF}}{e_{SEF}} \approx 60\%$ of that consumed by SEF, and $1 - \frac{e_{CFAEF}}{e_{DEF}} \approx 51\%$ of that consumed by DEF.
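
The percentages above can be reproduced directly from Eq. (18) and the closed-form expressions quoted from [30]. The short script below is only an arithmetic check; the function and variable names are illustrative, and the expressions for the other schemes are taken as given in the text rather than rederived.

```python
def energy(L_r, H, beta, p):
    """Eq. (18): energy of report forwarding, E = L_r * (H + beta / p)."""
    return L_r * (H + beta / p)

H = 50                                   # average hop count used in the text
e_ord = 2112 * H                         # no filtering
e_sef = 306 * (H + 200)                  # SEF, expression quoted from [30]
e_def = 732 * (H + 36)                   # DEF, expression quoted from [30]

L_r = 24 * 8 + 5 * 64                    # 24-byte report plus five 64-bit MACs = 512 bits
e_cfaef = energy(L_r, H, beta=10, p=1)   # CFAEF with p = 1

savings = {
    "no filtering": 1 - e_cfaef / e_ord,
    "SEF": 1 - e_cfaef / e_sef,
    "DEF": 1 - e_cfaef / e_def,
}
for name, s in savings.items():
    print(f"saving relative to {name}: {s:.1%}")
```

With $p = 1$ a false report is dropped at the first hop, so the $\beta / p$ term contributes only $\beta = 10$ hop-equivalents, which is where most of the saving comes from.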

VI. Conclusion 

A Constrained Function based message Authentication (CFA) scheme, which can be thought of as
a hash function directly supporting en-route filtering functionality, is proposed. Based on CFA,
we construct a CFA-based En-route Filtering (CFAEF) scheme to simultaneously defend against false
data injection, PDoS, and FEDoS attacks. Theoretical and numerical analyses are provided to
demonstrate the efficiency and effectiveness of CFAEF.

¶ In [30], the packet length is calculated by counting only the lengths of the report and MACs, excluding the lengths contributed
by the source node ID, destination node ID, and endorsing node IDs.